5 research outputs found

    On the automated compilation of UML notation to a VLIW chip multiprocessor

    With more and more cores available within architectures, extracting the implicit and explicit parallelism in applications to fully utilise these cores is becoming complex. Implicit parallelism is extracted by intelligent software and hardware sections of tool chains, although these reach their theoretical limit rather quickly. For this reason, a method of allowing explicit parallelism to be performed as fast as possible has been investigated. This method enables application developers to create and synchronise parallel sections of an application at a finer-grained level than previously possible, so that smaller sections of code can be executed in parallel while still reducing overall execution time. Alongside explicit parallelism, the high-level design of applications destined for multicore systems was also investigated. As systems grow larger, it is becoming more difficult to design and track the full life-cycle of development. One method used to ease this process is a graphical design process that visualises the high-level designs of such systems. One drawback of graphical design is the explicit nature in which systems are required to be generated. This was investigated and, using concepts already in use in text-based programming languages, the generation of platform-independent models that can be specialised to multiple hardware architectures was developed. The explicit parallelism was performed using hardware elements for thread management, which resulted in speed-ups of over 13 times compared to threading libraries executed in software on commercially available processors. This allowed applications with large data-dependent sections to be parallelised in small sections within the code, decreasing overall execution time. The modelling concepts saved between 40% and 50% of the time and effort required to generate platform-specific models, while incurring an overhead of up to 15% of the execution cycles of models designed for specific architectures.

    VThreads: A novel VLIW chip multiprocessor with hardware-assisted PThreads

    We discuss VThreads, a novel VLIW CMP with hardware-assisted shared-memory thread support. VThreads supports Instruction Level Parallelism via static multiple-issue and Thread Level Parallelism via hardware-assisted POSIX Threads, along with extensive customization. It allows the instantiation of tightly coupled streaming accelerators and supports up to 7-address Multiple-Input, Multiple-Output instruction extensions. VThreads is designed in technology-independent Register-Transfer-Level VHDL and prototyped on 40 nm and 28 nm Field-Programmable Gate Arrays. It was evaluated against a PThreads-based multiprocessor based on the Sparc-V8 ISA. On a 65 nm ASIC implementation, VThreads achieves up to a 7.2x performance increase on synthetic benchmarks, 5x on a parallel Mandelbrot implementation, 66% better performance on a threaded JPEG implementation, 79% better on an edge-detection benchmark and ~13% improvement on DES compared to the Leon3MP CMP. In the range of 2 to 8 cores, VThreads demonstrates a post-route (statistical) power reduction between 65% and 57%, at an area increase of 1.2%-10% for 1-8 cores, compared to a similarly configured Leon3MP CMP. This combination of micro-architectural features, scalability, extensibility, hardware support for low-latency PThreads, power efficiency and area makes the processor an attractive proposition for low-power, deeply embedded applications requiring minimum OS support.

    BioThreads: a novel VLIW-based chip multiprocessor for accelerating biomedical image processing applications

    We discuss BioThreads, a novel, configurable, extensible system-on-chip multiprocessor, and its use in accelerating biomedical signal processing applications such as imaging photoplethysmography (IPPG). BioThreads is derived from the LE1 open-source VLIW chip multiprocessor and efficiently handles instruction-, data- and thread-level parallelism. In addition, it supports a novel mechanism for the dynamic creation and allocation of software threads to uncommitted processor cores by implementing key POSIX Threads primitives directly in hardware, as custom instructions. In this study, the BioThreads core is used to accelerate the calculation of the oxygen saturation map of living tissue in an experimental setup consisting of a high-speed image acquisition system connected to an FPGA board and to a host system. Results demonstrate near-linear acceleration of the core kernels of the target blood perfusion assessment with an increasing number of hardware threads. The BioThreads processor was implemented on both standard-cell and FPGA technologies; in the first case, and for an issue width of two, full real-time performance is achieved with 4 cores, whereas on a mid-range Xilinx Virtex6 device this is achieved with 10 dual-issue cores. An 8-core LE1 VLIW FPGA prototype of the system executed 240 times faster than the scalar MicroBlaze processor, demonstrating the scalability of the proposed solution relative to a state-of-the-art soft CPU core provided by the FPGA vendor.

    Additional file 1: Figure S1. of Evaluating the potential of gold, silver, and silica nanoparticles to saturate mononuclear phagocytic system tissues under repeat dosing conditions

    Schematic of experimental design. Note that NP treatment is for each of the three NPs tested (e.g. 56 animals for AuNPs, 56 animals for AgNPs, 56 animals for SiO2 NPs). Note that this scheme does not take into account the extra animals that were dosed to allow for sufficient animal numbers at the end of the study (e.g. due to mis-dosing or animal death). Figure S2. DLS stability evaluation of NP dosing solutions. NPs were dispersed into D5W at representative dosing concentrations. DLS histograms were recorded up to 72 h after initial dilution. Representative histograms for A) AuNPs, B) AgNPs, C) SiO2 NPs. Dosing took no longer than 3.5 h for any NP. Table S1. Staining template for splenocyte analysis. Table S2. Summary Incidence Tables for AuNPs. Table S3. Summary Incidence Tables for AgNPs. Table S4. Summary Incidence Tables for SiO2 NPs. (DOC 664 kb)

    Managing travel fatigue and jet lag in athletes: A review and consensus statement

    Athletes are increasingly required to travel domestically and internationally, often resulting in travel fatigue and jet lag. Despite considerable agreement that travel fatigue and jet lag can be a real and impactful issue for athletes regarding performance and risk of illness and injury, evidence on optimal assessment and management is lacking. Therefore, 26 researchers and/or clinicians with knowledge of travel fatigue, jet lag and sleep in the sports setting formed an expert panel to formalise a review and consensus document. This manuscript includes definitions of terminology commonly used in the field of circadian physiology, outlines basic information on the human circadian system and how it is affected by time-givers, discusses the causes and consequences of travel fatigue and jet lag, and provides consensus on recommendations for managing travel fatigue and jet lag in athletes. The lack of evidence restricts the strength of the recommendations that are possible, but the consensus group identified the fundamental principles and interventions to consider for both the assessment and management of travel fatigue and jet lag. These are summarised in travel toolboxes that include strategies for pre-flight, during flight and post-flight. The consensus group also outlined specific steps to advance theory and practice in these areas.